 Real Estate


A Supplementary Material for Interior Point Solving for LP-based prediction+optimisation

Neural Information Processing Systems

A.1 Solution of the Newton Equation System of Eq. (11)

Here we discuss how we solve the equation system of Eq. (11); for more detail, refer to [4]. Consider the system of Eq. (11) with a generic right-hand side.

A.3 LP formulation of the Experiments

A.3.1 Details on the knapsack formulation of real estate investments

In this problem, H is the set of housings under consideration.

A.3.2 Details on energy-cost aware scheduling

In this problem, J is the set of tasks to be scheduled on M machines while respecting the requirements of R resources. The tasks must be scheduled over a set T of equal-length time periods. The first constraint ensures that each task is scheduled exactly once.
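Returning to A.1: a standard way to solve such a Newton system with a generic right-hand side is to eliminate the dual-slack and primal steps and solve the much smaller normal equations. A minimal sketch, assuming the usual primal-dual block structure (A dx = r1; Aᵀ dy + ds = r2; S dx + X ds = r3) — an assumption about the shape of Eq. (11), not a reproduction of it:

```python
import numpy as np

def newton_step(A, x, s, r1, r2, r3):
    """Solve the primal-dual Newton system with a generic r.h.s.:
        A dx          = r1
        A^T dy + ds   = r2
        S dx + X ds   = r3
    by eliminating ds and dx, then solving the normal equations
        (A D A^T) dy = r1 + A D (r2 - X^{-1} r3),  D = diag(x / s).
    """
    d = x / s                       # diagonal of D = X S^{-1}
    rhs = r1 + A @ (d * (r2 - r3 / x))
    M = (A * d) @ A.T               # A D A^T (columns of A scaled by d)
    dy = np.linalg.solve(M, rhs)
    dx = d * (A.T @ dy - r2 + r3 / x)   # back-substitute for dx
    ds = (r3 - s * dx) / x              # and then ds
    return dx, dy, ds
```

The normal-equations matrix A D Aᵀ is symmetric positive definite whenever A has full row rank and x, s > 0, which is exactly what an interior point iterate guarantees.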


Interior Point Solving for LP-based prediction+optimisation

Neural Information Processing Systems

Solving optimization problems is key to decision making in many real-life analytics applications. However, the coefficients of the optimization problems are often uncertain and dependent on external factors, such as future demand or energy or stock prices. Machine learning (ML) models, especially neural networks, are increasingly being used to estimate these coefficients in a data-driven way. Hence, end-to-end predict-and-optimize approaches, which consider how effective the predicted values are for solving the optimization problem, have received increasing attention. In the case of integer linear programming problems, a popular approach to overcome their non-differentiability is to add a quadratic penalty term to the continuous relaxation, so that results from differentiating over quadratic programs can be used. Instead, we investigate the use of the more principled logarithmic barrier term, as widely used in interior point solvers for linear programming. Specifically, instead of differentiating the KKT conditions, we consider the homogeneous self-dual formulation of the LP, and we show the relation between the interior point step direction and the corresponding gradients needed for learning. Finally, our empirical experiments demonstrate that our approach performs as well as, if not better than, the state-of-the-art QPTL (Quadratic Programming task loss) formulation of Wilder et al. [29] and the SPO approach of Elmachtoub and Grigas [12].
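As a toy illustration of the log-barrier idea — not the paper's homogeneous self-dual formulation — consider a box-constrained linear objective: adding a logarithmic barrier for the bounds makes the argmin a smooth function of the cost vector c, and its gradient follows from implicit differentiation of the stationarity condition. The barrier weight mu below is an illustrative choice:

```python
import numpy as np
from scipy.optimize import brentq

def barrier_argmin(c, mu=0.1):
    """Minimiser of c_i*x - mu*(log x + log(1-x)) on (0,1), per component.
    The stationarity condition c_i - mu/x + mu/(1-x) = 0 is strictly
    increasing in x, so a bracketed root-finder is safe."""
    g = lambda x, ci: ci - mu / x + mu / (1.0 - x)
    return np.array([brentq(g, 1e-12, 1 - 1e-12, args=(ci,)) for ci in c])

def barrier_grad(c, mu=0.1):
    """d x*(c) / d c via implicit differentiation of the stationarity
    condition: dx/dc = -1 / (mu/x^2 + mu/(1-x)^2)."""
    x = barrier_argmin(c, mu)
    return -1.0 / (mu / x ** 2 + mu / (1.0 - x) ** 2)
```

The gradient is strictly negative everywhere — raising a cost coefficient smoothly pushes its decision variable toward 0 — whereas the unregularized LP solution is piecewise constant in c and gives zero gradient almost everywhere, which is the core difficulty end-to-end approaches address.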


Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach

Neural Information Processing Systems

Web-scale visual entity recognition, the task of associating images with their corresponding entities within vast knowledge bases like Wikipedia, presents significant challenges due to the lack of clean, large-scale training data. In this paper, we propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation. Instead of relying on the multimodal LLM to directly annotate data, which we found to be suboptimal, we prompt it to reason about potential candidate entity labels by accessing additional contextually relevant information (such as Wikipedia), resulting in more accurate annotations. We further use the multimodal LLM to enrich the dataset by generating question-answer pairs and a grounded fine-grained textual description (referred to as "rationale") that explains the connection between images and their assigned entities. Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks (e.g.


PuzzleFusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving

Neural Information Processing Systems

This paper presents an end-to-end neural architecture based on Diffusion Models for spatial puzzle solving, particularly jigsaw puzzle and room arrangement tasks. In the latter task, for instance, the proposed system takes a set of room layouts as polygonal curves in the top-down view and aligns the room layout pieces by estimating their 2D translations and rotations, akin to solving a jigsaw puzzle of room layouts. A surprising discovery of the paper is that the simple use of a Diffusion Model effectively solves these challenging spatial puzzle tasks as a conditional generation process. To enable learning of an end-to-end neural system, the paper introduces new datasets with ground-truth arrangements: 1) the 2D Voronoi jigsaw dataset, a synthetic one where pieces are generated by the Voronoi diagram of a 2D point set; and 2) the MagicPlan dataset, a real one offered by MagicPlan from its production pipeline, where pieces are room layouts constructed by an augmented-reality app by real-estate consumers. The qualitative and quantitative evaluations demonstrate that our approach outperforms the competing methods by significant margins in all the tasks. We have provided code and data here.
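The "conditional generation" framing can be made concrete with a generic DDPM-style training step on 3-DoF piece poses (tx, ty, theta). This is a minimal sketch of the standard diffusion recipe, not the paper's architecture; `predict_eps` is a placeholder for a denoising network conditioned on piece geometry:

```python
import numpy as np

def ddpm_training_loss(poses, cond, predict_eps, T=1000, seed=0):
    """One DDPM-style training step: sample a timestep, noise the
    ground-truth poses with the closed-form forward process
    x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps, and return the MSE
    between the true noise and the network's prediction."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)          # linear noise schedule
    alpha_bar = np.cumprod(1.0 - betas)
    t = rng.integers(T)                          # random timestep
    eps = rng.standard_normal(poses.shape)       # target noise
    noisy = np.sqrt(alpha_bar[t]) * poses + np.sqrt(1.0 - alpha_bar[t]) * eps
    return np.mean((predict_eps(noisy, t, cond) - eps) ** 2)
```

At inference, the same network denoises random pose vectors step by step, conditioned on the piece geometry, until a coherent arrangement emerges.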


Convex Elicitation of Continuous Properties

Neural Information Processing Systems

A property or statistic of a distribution is said to be elicitable if it can be expressed as the minimizer of some loss function in expectation. Recent work shows that continuous real-valued properties are elicitable if and only if they are identifiable, meaning the set of distributions with the same property value can be described by linear constraints. From a practical standpoint, one may ask for which such properties do there exist convex loss functions. In this paper, in a finite-outcome setting, we show that in fact essentially every elicitable real-valued property can be elicited by a convex loss function. Our proof is constructive, and leads to convex loss functions for new properties.
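The two definitions in play can be stated compactly; the following is the standard formulation (the identification-function notation V is conventional, not necessarily the paper's):

```latex
% Elicitability: \Gamma : \Delta(\mathcal{Y}) \to \mathbb{R} is elicitable if,
% for some loss L,
\Gamma(p) \;=\; \operatorname*{arg\,min}_{r \in \mathbb{R}}
  \; \mathbb{E}_{Y \sim p}\!\left[\, L(r, Y) \,\right] .
% Identifiability: \Gamma is identifiable if there is an identification
% function V with
\Gamma(p) = r \;\Longleftrightarrow\; \mathbb{E}_{Y \sim p}\!\left[\, V(r, Y) \,\right] = 0 ,
% so the level set \{\, p : \Gamma(p) = r \,\} is cut out by a linear
% constraint in p.
```

The paper's result then says that, for finite outcome spaces, one can always choose the loss L above to be convex in r.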



DOGE Put a College Student in Charge of Using AI to Rewrite Regulations

WIRED

A young man with no government experience who has yet to even complete his undergraduate degree is working for Elon Musk's so-called Department of Government Efficiency (DOGE) at the Department of Housing and Urban Development (HUD) and has been tasked with using artificial intelligence to rewrite the agency's rules and regulations. Christopher Sweet was introduced to HUD employees as being originally from San Francisco and most recently a third-year at the University of Chicago, where he was studying economics and data science, in an email sent to staffers earlier this month. "I'd like to share with you that Chris Sweet has joined the HUD DOGE team with the title of special assistant, although a better title might be 'AI computer programming quant analyst,'" Scott Langmack, a DOGE staffer and chief operating officer of an AI real estate company, wrote in an email widely shared within the agency and reviewed by WIRED. "With family roots from Brazil, Chris speaks Portuguese fluently. Please join me in welcoming Chris to HUD!" Sweet's primary role appears to be leading an effort to leverage artificial intelligence to review HUD's regulations, compare them to the laws on which they are based, and identify areas where rules can be relaxed or removed altogether.


Can Moran Eigenvectors Improve Machine Learning of Spatial Data? Insights from Synthetic Data Validation

arXiv.org Machine Learning

Moran Eigenvector Spatial Filtering (ESF) approaches have shown promise in accounting for spatial effects in statistical models. Can this extend to machine learning? This paper examines the effectiveness of using Moran Eigenvectors as additional spatial features in machine learning models. We generate synthetic datasets with known processes involving spatially varying and nonlinear effects across two different geometries. Moran Eigenvectors calculated from different spatial weights matrices, with and without a priori eigenvector selection, are tested. We assess the performance of popular machine learning models, including Random Forests, LightGBM, XGBoost, and TabNet, and benchmark their accuracies in terms of cross-validated R² values against models that use only coordinates as features. We also extract coefficients and functions from the models using GeoShapley and compare them with the true processes. Results show that machine learning models using only location coordinates achieve better accuracies than eigenvector-based approaches across various experiments and datasets. Furthermore, we note that while these findings are relevant for spatial processes that exhibit positive spatial autocorrelation, they do not necessarily apply when modeling network autocorrelation or cases with negative spatial autocorrelation, where Moran Eigenvectors would still be useful.
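For concreteness, Moran eigenvectors are the eigenvectors of the doubly centred spatial weights matrix M W M, with M = I − 11ᵀ/n; those with large positive eigenvalues capture strong positive spatial autocorrelation patterns and are the ones appended as features. A minimal dense-matrix sketch (the paper's eigenvector-selection procedure is not reproduced here):

```python
import numpy as np

def moran_eigenvectors(W, k):
    """Top-k Moran eigenvectors of spatial weights matrix W:
    eigenvectors of M W M with M = I - 11^T/n, ordered by descending
    eigenvalue. W is symmetrised first so the spectrum is real."""
    n = W.shape[0]
    Ws = 0.5 * (W + W.T)                     # enforce symmetry
    M = np.eye(n) - np.ones((n, n)) / n      # centring projector
    vals, vecs = np.linalg.eigh(M @ Ws @ M)  # ascending eigenvalues
    order = np.argsort(vals)[::-1]           # reorder descending
    return vals[order][:k], vecs[:, order][:, :k]
```

In the ESF setting, these k columns would simply be stacked next to the coordinate features before fitting the learner; the paper's comparison is between that augmented design and coordinates alone.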


Are We Taking A.I. Seriously Enough?

The New Yorker

My in-laws own a little two-bedroom beach bungalow. It's part of a condo development that hasn't changed much in fifty years. The units are connected by brick paths that wind through palm trees and tiki shelters to a beach. Nearby, developers have built big hotels and condo towers, and it's always seemed inevitable that the bungalows would be razed and replaced. But it's never happened, probably because, according to the association's bylaws, eighty per cent of the owners have to agree to a sale of the property.

